
feat!: common chart v2.0.0#242

Open
Glenn-Terjesen wants to merge 67 commits into main from v2

Conversation

Contributor

@Glenn-Terjesen Glenn-Terjesen commented Mar 25, 2026

Common Chart v2.0.0

Upgrade guide

See UPGRADE.md for the full migration guide.

Paste this into Claude Code, Copilot, Cursor, or any AI coding agent from your application's repo:

Upgrade the Entur common Helm chart dependency from v1 to v2.

Read the upgrade skill and follow its instructions:
  https://raw.githubusercontent.com/entur/helm-charts/main/.claude/skills/upgrade-common-chart/SKILL.md

Apply all migration steps to every values file in this repository.
Run `helm dependency update` and `helm lint` to verify.

Breaking Changes

| v1 | v2 | Notes |
|----|----|-------|
| `shortname` | `appId` | Matches GoogleCloudApplication `metadata.id` |
| `container.replicas` | `deployment.minReplicas` | HPA controls pod count, Helm never resets it |
| `deployment.replicas` | `deployment.minReplicas` | Same — renamed for clarity |
| `container.maxReplicas` | `deployment.maxReplicas` | Moved to deployment |
| `container.forceReplicas` | `deployment.forceReplicas` | Moved to deployment |
| `container.minAvailable` | `deployment.minAvailable` | Moved to deployment |
| `container.memoryLimit` | removed | Memory limit always equals memory request |
| `pdb.minAvailable` | `deployment.minAvailable` | Single place to configure |
| `postgres.connectionConfig` | `postgres.enabled: true` | `secretKeyPrefix` integration via External Secrets |
| `postgres.instances: [PGINSTANCES]` | `postgres.instances: [{secretKeyPrefix: PG}]` | Or just `enabled: true` for the default `PG` prefix |
| `postgres.memoryLimit` | removed | Use `postgres.memory` |
| `postgres.termTimeout` | `postgres.maxSigtermDelay` | Renamed to match the Cloud SQL Proxy v2 flag |
| `kubernetes.io/ingress.class` | `ingress.ingressClassName` | K8s standard field |
| `container.cpu: 0.1` | `container.cpu: 0.3` | JVM-friendly default; override down for non-JVM workloads |
| `container.memory: 16` | `container.memory: 512` | Sized for JVM startup (~150–250 MiB just for the JVM); override down for lighter workloads |
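Taken together, a typical v1 values file migrates along these lines (the app name and numbers below are illustrative, not taken from the chart):

```yaml
# v1 (before)
shortname: my-app
container:
  replicas: 2
  maxReplicas: 6
  memory: 512
  memoryLimit: 768
pdb:
  minAvailable: 1

# v2 (after)
appId: my-app
container:
  memory: 512            # limit now always equals the request
deployment:
  minReplicas: 2
  maxReplicas: 6
  minAvailable: 1
```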

What's New

Postgres secretKeyPrefix integration — The secretKeyPrefix is now the single contract between the Helm chart and the entur/terraform-google-sql-db Terraform module. Given a prefix (default PG), the chart derives all Secret Manager keys ({prefix}INSTANCES, {prefix}USER, {prefix}PASSWORD) and fetches everything via External Secrets. The simplest case is just postgres.enabled: true. Multiple instances are supported via the instances list. The chart generates {prefix}HOST=localhost and {prefix}PORT=5432+index as env vars. Terraform-created K8s secrets are no longer needed.

# Single instance (simplest)
postgres:
  enabled: true

# Multiple instances
postgres:
  enabled: true
  instances:
    - secretKeyPrefix: PG
    - secretKeyPrefix: ANALYTICS_PG

JVM-friendly resource defaults — container.cpu raised from 0.1 to 0.3 and container.memory from 16 to 512 (Mi). The old defaults were stub values that would OOMKill any JVM app before it finished booting. ~90% of Entur services are Spring Boot, so the new defaults match the common case. Lighter workloads (sidecars, small Go services, static frontends) should override down.

HPA always enabled — HPA runs in all environments (not just prd). Default minReplicas: 2 everywhere. Deployment spec never emits replicas, so helm upgrade can't reset HPA-managed pod counts. Use forceReplicas to opt out.
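For a workload that must keep a fixed pod count, the opt-out described above is a one-liner (sketch):

```yaml
deployment:
  forceReplicas: 1   # disables HPA; the Deployment manages replicas directly
```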

PDB fixes — unhealthyPodEvictionPolicy: AlwaysAllow prevents unhealthy pods from blocking cluster upgrades. forceReplicas > 1 now correctly gets PDB protection.

Cloud SQL Proxy v2 — Upgraded to v2 (2.21.2). Prometheus metrics on port 9801. Configurable shutdown delay via postgres.maxSigtermDelay.
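With the rename, tuning the proxy's shutdown grace looks roughly like this (the 30s value is an example, and the duration format is assumed to match the proxy's max-sigterm-delay flag):

```yaml
postgres:
  enabled: true
  maxSigtermDelay: 30s   # v1: postgres.termTimeout
```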

GKE Startup CPU Boost (alpha) — Optional via deployment.startupCPUBoost.enabled. Temporarily increases CPU during startup and reverts when the pod is Ready. Auto-sets the CPU limit to 1.3x the request. NB: not ready yet!
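Enabling the boost is a single flag (sketch; the kube-startup-cpu-boost operator must already be installed in the cluster):

```yaml
deployment:
  startupCPUBoost:
    enabled: true   # CPU limit auto-set to 1.3x request unless cpuLimit is set
```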

Native gRPC probes — grpc: true now uses K8s native gRPC probes with service.internalPort. No need for the /bin/grpc_health_probe binary or manual port config.
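A minimal gRPC service sketch (the exact nesting of grpc and the port value here are illustrative; the chart's values.schema.json is the authoritative shape):

```yaml
grpc: true
service:
  internalPort: 9090   # native K8s gRPC probes target this port
```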

Custom HPA metrics — hpa.metrics list for Pods/External/Object metrics alongside the default CPU metric. ScaleUp stabilization window (120s default) when CPU boost is disabled.
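Entries in hpa.metrics presumably follow the standard autoscaling/v2 metric spec; a hedged example scaling on Pub/Sub queue depth might look like:

```yaml
hpa:
  metrics:
    - type: External
      external:
        metric:
          name: pubsub.googleapis.com|subscription|num_undelivered_messages
        target:
          type: AverageValue
          averageValue: "100"
```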

JSON Schema validation — values.schema.json catches typos and unknown properties on helm lint. IDE autocompletion in VS Code and JetBrains.

Helm 3 + 4 — CI tests all run against both Helm v3.20.0 and v4.1.3.

Other Improvements

  • Startup probe supports path for httpGet (Allow path for startup probe #237)
  • deployment.cpuUtilization (default 70%) replaces top-level cpuUtilization (HPA averageUtilization should be placed under deployment #221)
  • Per-ingress annotations and ingressClassName support
  • Cron: added seccompProfile, fixed postgres proxy placement
  • Rolling update defaults: maxSurge: 1, maxUnavailable: 1
  • Memory limit always equals memory request (no more 1.2x multiplier)
  • Fixed memoryLimt typo in postgres proxy helper
  • Removed dead grpcexecprobes helper
  • Example chart READMEs now use README.md.gotmpl templates with hand-written narrative (purpose, when to use, key values) that survives helm-docs regeneration; values tables remain auto-generated
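The startup-probe path support from #237 means an httpGet startup check can be configured like this (the endpoint is an example):

```yaml
container:
  probes:
    startup:
      path: /actuator/health/readiness   # httpGet instead of tcpSocket
```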

CI

  • Unit tests + install tests + example validation on both Helm 3 and Helm 4
  • Examples validated against local chart (not published repo)
  • helm lint added to example validation
  • kube-startup-cpu-boost operator installed in kind cluster
  • helm-docs workflow now triggers correctly on release-please PRs (was filtering the wrong branch direction); concurrency group added to avoid races with release-please regenerations

Closes

Closes #101, closes #126, closes #195, closes #221, closes #225, closes #235, closes #237


Generated with Claude Code using Claude Opus 4.6

Glenn-Terjesen and others added 16 commits March 25, 2026 11:49
…s eviction of unhealthy pods

- Add unhealthyPodEvictionPolicy: AlwaysAllow to prevent unhealthy pods from blocking node drains during cluster upgrades
- Fix forceReplicas > 1 getting minAvailable 0% (all pods evictable)
- Fix replicas=1 with HPA getting minAvailable 0% despite 2+ pods running
- PDB logic now checks effective replicas: forceReplicas, HPA min replicas, or configured replicas

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nnotations

- Add missing seccompProfile to cron pod securityContext (matches deployment)
- Move postgres proxy outside container loop in cron to prevent duplicate sidecars
- Support per-ingress annotations when using ingresses list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…l Secrets

- Upgrade cloud-sql-proxy from v1 (1.33.16) to v2 (2.21.2)
- Update image, executable name, and CLI flags for v2
- Fix memoryLimt typo in helpers, deprecate memoryLimit (limit = request)
- Replace configmap-based connection config with External Secrets
- Add postgres.instances to configure Secret Manager keys for instance connection names
- Support multiple SQL databases via indexed CSQL_PROXY_INSTANCE_CONNECTION_NAME_N env vars
- Deprecate postgres.connectionConfig with fail message
- Fix deployment.prometheus.path not falling back to default (#225)

BREAKING CHANGE: postgres.connectionConfig is removed in favor of postgres.instances.
Users must migrate their connection config from Kubernetes ConfigMaps to
Secret Manager keys via External Secrets (e.g. postgres.instances: [PGINSTANCES]).
Add deployment.cpuUtilization as the preferred location for HPA CPU
target utilization. Falls back to top-level cpuUtilization for
backwards compatibility, then to default 100%.
- Add StartupCPUBoost CRD resource (enabled by default, 50% increase)
- Boost targets pods by app label, reverts when pod becomes Ready
- Lower default HPA cpuUtilization from 100% to 70% (best practice
  when startup CPU spikes are handled by the boost operator)
- Requires kube-startup-cpu-boost operator installed in the cluster
When container.probes.startup.path is set, the startup probe uses
httpGet instead of tcpSocket. This enables custom startup health
checks for apps with long-running startup tasks like cache warming.
Setting grpc: true now uses native K8s gRPC probes with internalPort
by default. No need to manually set probe ports for each probe.
Removes fallback to exec-based grpc_health_probe which required the
binary in the container image.
…ssName (#126)

Replace kubernetes.io/ingress.class annotation (deprecated since K8s
1.18) with spec.ingressClassName. Defaults to "traefik", configurable
via ingress.ingressClassName or per-ingress ingressClassName field.
Add appId field matching GoogleCloudApplication metadata.id. Falls
back to shortname for backwards compatibility. Adds new "appId" label
to all resources alongside existing "shortname" label.
Add hpa.metrics list for appending custom metrics (Pods, External,
Object) alongside the default CPU utilization metric. Supports
Prometheus/GMP gauges, Pub/Sub queue depth, and any Cloud Monitoring
metric. The existing hpa.spec override for full control is preserved.
- Remove unused grpcexecprobes helper (native K8s gRPC probes are now default)
- Remove shortname value entirely — appId is now required with fail message for migration
- Remove top-level cpuUtilization fallback, use only deployment.cpuUtilization
- Fix readiness probe comment typo (said "liveness")
- Rename shortname to appId in all fixtures and test values

BREAKING CHANGE: shortname is removed. Use appId instead.
…theus metrics

- Remove container.replicas, container.maxReplicas, container.forceReplicas,
  container.minAvailable, container.terminationGracePeriodSeconds — these are
  now only under deployment.* where they belong
- Remove container.* fallbacks from deployment.yaml, hpa.yaml, pdb.yaml
- Enable prometheus metrics on Cloud SQL proxy v2 (--http-port=9801 --prometheus)
  exposing metrics at :9801/metrics for monitoring proxy health and connections
- Update all tests and fixtures to use deployment.* for scaling fields

BREAKING CHANGE: container.replicas, container.maxReplicas, container.forceReplicas,
container.minAvailable, and container.terminationGracePeriodSeconds are removed.
Use deployment.replicas, deployment.maxReplicas, etc. instead.
When postgres is enabled, adds prometheus.io/scrape-sql-proxy,
prometheus.io/sql-proxy-port (9801), and prometheus.io/sql-proxy-path
(/metrics) annotations to pods. Allows configuring Prometheus to scrape
the SQL proxy sidecar alongside the main application.
@Glenn-Terjesen Glenn-Terjesen requested a review from a team as a code owner March 25, 2026 13:26
Tests were still asserting livenessProbe.exec.command from the removed
grpcexecprobes helper. Updated to assert livenessProbe.grpc which is
the native K8s gRPC probe now used by default.
Add the StartupCPUBoost Helm chart to the CI kind cluster setup so
the StartupCPUBoost CRD is available during helm install tests.
…repo

Copy the local common chart into each example's charts/ directory
before running helm template. This tests examples against the current
branch's chart instead of the published version.
…-backend

- Add values-kub-ent-tst.yaml and values-kub-ent-prd.yaml for grpc-app
  example so CI can validate all environments
- Add postgres.instances to typical-backend example (required by v2 proxy)
Add values.schema.json matching v2 values structure. Validates values
on helm install/upgrade/template/lint to catch typos and unknown
properties early. Updated from PR #222 for v2 changes: appId instead
of shortname, scaling fields under deployment only, postgres.instances,
hpa.metrics, startupCPUBoost, ingressClassName, startup probe path.
Run unit tests, kind cluster install tests, and example validation
against both Helm 3 and Helm 4 using matrix strategy.
…isabled

When CPU boost is off, Java startup CPU spikes can trigger unnecessary
HPA scale-ups. A 120s stabilization window gives pods time to finish
startup before HPA acts on the elevated CPU. When CPU boost is enabled
the window is not needed since startup spikes are handled by the boost.
…ionWindowSeconds

Defaults to 120s when startupCPUBoost is disabled. Tune to match your
application's typical startup time (e.g. 60s for a fast app, 300s for
a heavy Spring Boot app with cache warming).
When CPU boost is enabled and no explicit cpuLimit is set, the CPU
limit is automatically set to 130% of the CPU request. This gives
the boost operator a ceiling to work within. Explicit cpuLimit
always takes precedence.
Memory limit is now always equal to memory request. The previous 1.2x
multiplier and memoryLimit override are removed. container.memoryLimit
is deprecated with a note to use container.memory instead.

BREAKING CHANGE: container.memoryLimit is removed. Memory limit now
always equals memory request. Set container.memory to the value you need.
HPA is now always enabled (unless forceReplicas is set). The Deployment
spec never emits replicas — HPA controls pod count in all environments.

Default minReplicas by environment:
- sbx/dev/tst: 1 (scales down to single pod in low traffic)
- prd: 2 (HA by default)

deployment.replicas overrides the default minReplicas for any env.
PDB protection follows minReplicas: 0% when minReplicas=1, 50% when >=2.

This prevents the v1 bug where helm upgrade would reset HPA-managed
replica counts back to the configured value.

BREAKING CHANGE: HPA is now enabled by default in all environments.
Use deployment.forceReplicas to opt out of HPA.
deployment.replicas is removed. Use deployment.minReplicas to set the
HPA minimum replica count, or deployment.forceReplicas to disable HPA.

Since HPA is always enabled, the Deployment spec never emits replicas
— HPA controls the pod count. This prevents helm upgrade from resetting
HPA-managed replica counts.

BREAKING CHANGE: deployment.replicas is removed. Use deployment.minReplicas
(sets HPA minimum) or deployment.forceReplicas (disables HPA).
…db.minAvailable

- Default minReplicas to 2 in all environments (no more env-aware branching)
- Remove pdb.minAvailable — use deployment.minAvailable instead (one place to configure)
- Change maxSurge and maxUnavailable defaults from 25% to 1 (works correctly with 2 replicas)
- PDB automatically 50% when minReplicas >= 2, 0% when minReplicas = 1

BREAKING CHANGE: pdb.minAvailable is removed. Use deployment.minAvailable instead.
Default minReplicas is now 2 in all environments (was 1 in dev/tst).
Default maxSurge and maxUnavailable changed from 25% to 1.
@Glenn-Terjesen Glenn-Terjesen changed the title feat!: common chart v2 feat!: common chart v2.0.0 Mar 26, 2026
Glenn-Terjesen and others added 13 commits March 26, 2026 11:00
# Conflicts:
#	.github/workflows/pull-request.yml
#	charts/common/Chart.yaml
#	charts/common/templates/_helpers.tpl
#	examples/common/cronjob/Chart.yaml
#	examples/common/grpc-app/Chart.yaml
#	examples/common/multi-container/Chart.yaml
#	examples/common/multi-deploy/Chart.yaml
#	examples/common/simple-app/Chart.yaml
#	examples/common/typical-backend/Chart.yaml
#	examples/common/typical-frontend/Chart.yaml
Replace the split credential model (ExternalSecrets for proxy connection
names + Terraform-created K8s secrets for credentials) with a unified
approach where secretKeyPrefix is the single contract between Terraform
and Helm.

Given a prefix (default PG), the chart derives all Secret Manager keys
({prefix}INSTANCES, {prefix}USER, {prefix}PASSWORD) and fetches
everything via ExternalSecrets. The simplest case is just
`postgres.enabled: true`.

Changes:
- postgres.instances now takes objects with secretKeyPrefix instead of
  raw Secret Manager key names
- New sql-credentials ExternalSecret fetches {prefix}USER and
  {prefix}PASSWORD from Secret Manager
- Chart generates {prefix}HOST=localhost and {prefix}PORT=5432+index
  as env vars (no longer fetched from Secret Manager)
- Proxy command adds --port=5432 for deterministic port assignment
- postgres.termTimeout renamed to postgres.maxSigtermDelay to match
  the Cloud SQL Proxy v2 flag
- Removed v1 compatibility tests and deprecated fields
  (connectionConfig, memoryLimit)
Claude Code skill that automates migrating Helm values files from
common chart v1 to v2. Covers all breaking changes: shortname to
appId, scaling field moves, postgres secretKeyPrefix integration,
memoryLimit removal, and configmap.toEnv removal.
Default container.cpu raised from 0.1 to 0.3 and container.memory from
16 to 512 (Mi). The previous defaults were stub values that would
OOMKill any JVM app before it finished booting; the JVM alone needs
~150-250 MiB before app code runs, and ~90% of Entur services are
Spring Boot.

Existing values that omit container.cpu / container.memory will see
3x CPU and 32x memory requests on next deploy. Override down for
non-JVM workloads (sidecars, small Go services, static frontends).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The pull_request branches filter was matching the wrong direction
(base instead of head), so the workflow never triggered on
release-please PRs. Switch to filtering base on main and gating the
job with an if: condition on github.head_ref.

Also fix:
- checkout uses github.head_ref instead of github.ref (which is the
  ephemeral merge ref on pull_request events)
- drop redundant git switch (checkout already lands on the branch)
- move the printf below the VERSION export
- replace inline shell substitution in jq and yq filters with --arg /
  strenv for safer interpolation
- quote $CUR_CHART in find
- add a concurrency group to avoid races with release-please
  regenerations

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add README.md.gotmpl per example so the narrative survives helm-docs
regeneration. Each template renders the chart header and badges, then
custom what / when / key-values sections, then the auto-generated
requirements + values tables.

The narrative explains each example's purpose in plain language, with
cross-references between them (e.g. cronjob points to multi-deploy
for event-driven work; multi-container points to multi-deploy as the
preferred alternative when processes can run independently). A short
gloss of "ingress" is included where relevant.

Also fix a TZx -> TZ typo in typical-frontend/values.yaml so the
configmap example matches the standard timezone env var.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>